External Memory Pipelining Made Easy With TPIE

Abstract

When handling large datasets that exceed the capacity of the main memory, movement of data between main memory and external memory (disk), rather than actual (CPU) computation time, is often the bottleneck in the computation. Since data is moved between disk and main memory in large contiguous blocks, this has led to the development of a large number of I/O-efficient algorithms that minimize the number of such block movements.

TPIE is one of two major libraries that have been developed to support I/O-efficient algorithm implementations. TPIE provides an interface where list stream processing and sorting can be implemented in a simple and modular way without having to worry about memory management or block movement. However, if care is not taken, such streaming-based implementations can lead to practically inefficient algorithms since lists of data items are typically written to (and read from) disk between components.

In this paper we present a major extension of the TPIE library that includes a pipelining framework that allows for practically efficient streaming-based implementations while minimizing I/O-overhead between streaming components. The framework pipelines streaming components to avoid I/Os between components, that is, it processes several components simultaneously while passing output from one component directly to the input of the next component in main memory. TPIE automatically determines which components to pipeline and performs the required main memory management, and the extension also includes support for parallelization of internal memory computation and progress tracking across an entire application. The extended library has already been used to evaluate I/O-efficient algorithms in the research literature and is heavily used in I/O-efficient commercial terrain processing applications by the Danish startup SCALGO.
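The difference between writing intermediate lists out between components and pipelining them can be illustrated with a small sketch. This is not TPIE code (TPIE is a C++ library, and its API is not shown in this record); it is a Python analogy using generators. The names `CountingStore`, `scale`, `keep_even`, `run_materialized` and `run_pipelined` are hypothetical, and the "store" is an in-memory stand-in that only counts item transfers rather than performing real block I/O.

```python
from typing import Iterable, Iterator, List, Tuple


class CountingStore:
    """Hypothetical stand-in for a disk-backed list; it only counts transfers."""

    def __init__(self) -> None:
        self.items: List[int] = []
        self.writes = 0
        self.reads = 0

    def write_all(self, data: Iterable[int]) -> None:
        for x in data:
            self.items.append(x)
            self.writes += 1

    def read_all(self) -> Iterator[int]:
        for x in self.items:
            self.reads += 1
            yield x


def scale(data: Iterable[int], factor: int = 3) -> Iterator[int]:
    """First streaming component: multiply every item by `factor`."""
    for x in data:
        yield x * factor


def keep_even(data: Iterable[int]) -> Iterator[int]:
    """Second streaming component: keep only the even items."""
    for x in data:
        if x % 2 == 0:
            yield x


def run_materialized(source: Iterable[int]) -> Tuple[List[int], int]:
    """The first component's whole output is stored before the second reads it."""
    between = CountingStore()
    between.write_all(scale(source))
    result = list(keep_even(between.read_all()))
    return result, between.writes + between.reads


def run_pipelined(source: Iterable[int]) -> Tuple[List[int], int]:
    """Each item is handed to the next component directly, one at a time."""
    result = list(keep_even(scale(source)))
    return result, 0  # no intermediate list exists, so nothing is transferred


if __name__ == "__main__":
    print(run_materialized(range(10)))
    print(run_pipelined(range(10)))
```

Both variants compute the same result; only the materialized variant moves every intermediate item to the store and back. The sketch leaves out what the abstract says the framework handles automatically: deciding which components to pipeline and managing main memory.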

Bibliographic record

Authors: Arge, Lars; Rav, Mathias; Svendsen, Svend C.; Truelsen, Jakob
Year: 2017
Original format: PDF

Conference version: External memory pipelining made easy with TPIE. Lars Arge, Mathias Rav, Svend C. Svendsen. IEEE International Conference on Big Data, 2017.
